Equivalence of Optimality Criteria for Markov Decision Process and Model Predictive Control
Abstract
This paper shows that the optimal policy and value functions of a Markov Decision Process (MDP), either discounted or not, can be captured by a finite-horizon undiscounted Optimal Control Problem (OCP), even if based on an inexact model. This can be achieved by selecting a proper stage cost and terminal cost for the OCP. A very useful particular case of OCP is a Model Predictive Control (MPC) scheme where a deterministic (possibly nonlinear) model is used to reduce the computational complexity. This observation leads us to parameterize an MPC scheme fully, including the cost function. In practice, Reinforcement Learning algorithms can then be used to tune the parameterized MPC scheme. We verify the developed theorems analytically in an LQR case and we investigate some other nonlinear examples in simulations.
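As a rough illustration of the LQR case mentioned in the abstract, the sketch below checks a standard dynamic-programming fact in the simplest setting only (deterministic dynamics, exact model, no discounting): if the terminal cost of a finite-horizon problem equals the infinite-horizon value function, the first-stage feedback gain matches the infinite-horizon optimal gain for any horizon length. The system matrices and all function names are assumptions chosen for illustration and do not come from the paper, and the paper's results for discounted MDPs and inexact models are not reproduced here.

```python
import numpy as np


def riccati_step(A, B, Q, R, P):
    """One backward dynamic-programming step for x+ = A x + B u.

    Given the cost-to-go matrix P of the next stage, return the
    cost-to-go matrix of the current stage and the gain K (u = -K x).
    """
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P_prev = Q + A.T @ P @ (A - B @ K)
    return P_prev, K


def infinite_horizon_lqr(A, B, Q, R, tol=1e-10, max_iter=10_000):
    """Iterate the Riccati recursion until the cost-to-go stops changing."""
    P = Q.copy()
    for _ in range(max_iter):
        P_next, _ = riccati_step(A, B, Q, R, P)
        if np.max(np.abs(P_next - P)) < tol:
            _, K = riccati_step(A, B, Q, R, P_next)
            return P_next, K
        P = P_next
    raise RuntimeError("Riccati iteration did not converge")


def first_stage_gain(A, B, Q, R, P_terminal, horizon):
    """Gain applied at the first stage of a finite-horizon problem."""
    P = P_terminal
    K = None
    for _ in range(horizon):
        P, K = riccati_step(A, B, Q, R, P)
    return K


# Hypothetical double-integrator example (not taken from the paper).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P_inf, K_inf = infinite_horizon_lqr(A, B, Q, R)
for N in (1, 3, 10):
    K_exact = first_stage_gain(A, B, Q, R, P_inf, N)
    K_naive = first_stage_gain(A, B, Q, R, np.zeros((2, 2)), N)
    print(N, np.abs(K_exact - K_inf).max(), np.abs(K_naive - K_inf).max())
```

Each printed row shows the horizon, then the gap to the infinite-horizon gain with the value-function terminal cost, then the gap with a zero terminal cost. The second column should stay at numerical noise for every horizon, while the third shrinks only as the horizon grows.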
Related resources
Improved Optimization Process for Nonlinear Model Predictive Control of PMSM
Model-based predictive control (MPC) is one of the most efficient techniques and is widely used in industrial applications. In such controllers, increasing the prediction horizon results in better selection of the optimal control signal sequence. On the other hand, increasing the prediction horizon increases the computational time of the optimization process, which makes it impossible to be imple...
Learning-based model predictive control for Markov decision processes
We propose the use of Model Predictive Control (MPC) for controlling systems described by Markov decision processes. First, we consider a straightforward MPC algorithm for Markov decision processes. Then, we propose value functions, a means to deal with issues arising in conventional MPC, e.g., computational requirements and sub-optimality of actions. We use reinforcement learning to let an MPC...
Continuous-time Markov decision processes with nth-bias optimality criteria
In this paper, we study the nth-bias optimality problem for finite continuous-time Markov decision processes (MDPs) with a multichain structure. We first provide nth-bias difference formulas for two policies and present some interesting characterizations of an nth-bias optimal policy by using these difference formulas. Then, we prove the existence of an nth-bias optimal policy by using nth-bias...
Constrained model predictive control: Stability and optimality
Model predictive control is a form of control in which the current control action is obtained by solving, at each sampling instant, a finite horizon open-loop optimal control problem, using the current state of the plant as the initial state; the optimization yields an optimal control sequence and the first control in this sequence is applied to the plant. An important advantage of this type of c...
On optimality of nonlinear model predictive control
In this note the Infinite Horizon (IH) optimality property of Nonlinear Model Predictive Control (MPC) is analysed. In particular, it is shown with a counterexample that the conjecture that the IH cost of the closed-loop system controlled with a stabilizing MPC controller is a monotonically decreasing function of the optimization horizon is fallacious.
Journal
Journal title: IEEE Transactions on Automatic Control
Year: 2023
ISSN: 0018-9286, 1558-2523, 2334-3303
DOI: https://doi.org/10.1109/tac.2023.3277309